Abstract

We consider the problem of non-uniform vote aggregation, and in particular, the algorithmic
aspects associated with the aggregation process. For a novel class of weighted distance
measures on votes, we present two different aggregation methods. The first algorithm is based
on approximating the weighted distance measure by Spearman's footrule distance, with provable
constant approximation guarantees. The second algorithm is based on a non-uniform
Markov chain method inspired by PageRank, for which currently only heuristic guarantees
are known. We illustrate the performance of the proposed algorithms on a number of distance
measures for which the optimal solution may be easily computed.

1 Introduction 

Vote (rank) aggregation has a long history, dating back to the first democratic elections held in
the polis of Athens under Solon and Cleisthenes [1]. The early voting process involved ranking two
candidates, so that the problem of determining the winner reduced to simple plurality vote counts.
In more recent political history, it was recognized that plurality methods, as well as majority
pairwise counts for multiple-candidate ranking systems, are plagued by a number of issues. These
issues were most succinctly identified by de Borda and Condorcet [2], and pertain to the fact that
votes may be non-transitive, that "strong candidates" may lose out to "weak candidates" due to
their close mutual competition, and that majority pairwise counts may differ substantially from
plurality counts.

The issues described above with aggregating multiple votes/rankings led to a line of work
centered around the use of distance measures between rankings [3] and an underlying axiomatic
approach [3]. The idea behind the distance-based approach is that one can cast the aggregation
problem as one of evaluating the median of a set of points in a given metric space. Well-known
metrics used for computing the median include Kendall's $\tau$ and Spearman's footrule [5].

One of the drawbacks of the aforementioned distance-based aggregation methods is that the
distance functions do not cater to the needs of certain applications where similar items are to be
treated similarly in the aggregation process, and where the top and the bottom of the list carry
different relevance in the ranking [6]. An example of the first scenario is ranking candidates
for a number of positions, with the constraint that some candidate diversity criteria are met. An
example of the second scenario is ranking candidates where only a small fraction at the
top is considered for a position.

In two companion papers [7, 8], we studied a very general class of distance measures, termed
cost-constrained transposition distances, or weighted transposition distances, which, among other
things, address the two aggregation issues described above. The crux of this new approach to measuring
distances between rankings is to assign non-uniform swapping costs (weights) to different pairs of
locations in the list, or equivalently, different pairs of elements in the inverse list. Aggregation
methods based on weighted transpositions are currently unknown, and they are the topic of this
paper.

The results we present pertain to algorithmic aspects of the weighted vote/rank aggregation
problem [9-12]. We describe three algorithms: a constant-approximation algorithm that uses an
analytical bound between the weighted distance and a generalization of Spearman's footrule, and
then solves a minimum weight matching problem (this algorithm is inspired by a procedure
described in [13], [14]); an aggregation method reminiscent of PageRank [13], where the "hyperlink
probabilities" are chosen according to the swapping weights; and a combination of the first
algorithm with local descent methods.

The paper is organized as follows. A brief introduction to vote aggregation and the problem
formulation are given in Section 2. The main contribution of the paper is presented in Section 3,
which contains the proposed aggregation algorithms. Results of various rank aggregation processes
on an Academic Climate Study dataset gathered at UIUC are presented in Section 4.

2 Preliminaries 

Suppose that an election process includes $m$ voters, each of whom provides a ranking of $n$
candidates. These rankings are collected in a set $\Sigma = \{\sigma_1, \sigma_2, \ldots, \sigma_m\}$,
where each ranking $\sigma_i$ is represented by a permutation in $\mathbb{S}_n$, the symmetric
group on $n$ elements.

Given a distance function $d$ over the permutations in $\mathbb{S}_n$, the distance-based aggregation
problem can be stated as follows. Find the ranking $\pi^*$ that minimizes the cumulative distance
from $\Sigma$, i.e.,

$$\pi^* = \arg\min_{\pi \in \mathbb{S}_n} \sum_{i=1}^{m} d(\pi, \sigma_i). \tag{1}$$

Clearly, the choice of the distance function $d$ is an important feature of the aggregation method.
We describe next a few such distance measures, including Kendall's $\tau$ and Spearman's footrule
distance [9].

Let $e = 12 \cdots n$ denote the identity permutation (ranking).

Definition 1. A transposition $(a\ b)$ in a permutation $\pi$ is the swap of elements in positions $a$
and $b$. When there is no confusion, we consider a transposition to be a permutation. If $|a - b| = 1$,
the transposition is referred to as an adjacent transposition.

It is well-known that any permutation may be reduced to the identity via transpositions or
adjacent transpositions. The former process is referred to as sorting, while the latter is known
as sorting with adjacent transpositions. The smallest number of transpositions needed to sort a
permutation $\pi$ is known as the Cayley distance, $T(e, \pi)$, while the smallest number of adjacent
transpositions needed to sort a permutation is known as Kendall's $\tau$ distance, $K(e, \pi)$.
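Both distances can be computed directly for small examples. The following is a minimal sketch, not part of the original development: the function names `kendall_tau` and `cayley`, and the convention that a ranking is a tuple listing candidates from first to last place, are assumptions made for illustration. Kendall's $\tau$ is obtained by counting inversions, and the Cayley distance as $n$ minus the number of cycles.

```python
def kendall_tau(pi, sigma):
    """Minimum number of adjacent transpositions turning pi into sigma."""
    pos = {c: r for r, c in enumerate(sigma)}   # rank of each candidate in sigma
    seq = [pos[c] for c in pi]                  # pi expressed in sigma's coordinates
    # Count inversions; O(n^2) is fine for small n.
    return sum(1 for a in range(len(seq)) for b in range(a + 1, len(seq))
               if seq[a] > seq[b])


def cayley(pi, sigma):
    """Minimum number of arbitrary transpositions turning pi into sigma."""
    pos = {c: r for r, c in enumerate(sigma)}
    seq = [pos[c] for c in pi]
    seen, cycles = [False] * len(seq), 0
    for start in range(len(seq)):
        if not seen[start]:
            cycles += 1
            k = start
            while not seen[k]:                  # walk one cycle of the permutation
                seen[k] = True
                k = seq[k]
    return len(seq) - cycles                    # n minus the number of cycles


e = (1, 2, 3, 4, 5)
print(kendall_tau(e, (2, 3, 4, 5, 1)))  # 4
print(cayley(e, (2, 3, 4, 5, 1)))       # 4
```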

Let $\Theta = \{(a\ b) : a \neq b,\ a, b \in [n]\}$ be the set of transpositions, and endow $\Theta$
with a non-negative weight function $\varphi : \Theta \to \mathbb{R}^+$, where $\varphi(a, b)$ is the
weight of transposition $(a\ b)$. The distance measure of interest is defined as the minimum weight of
a sequence of transpositions needed to transform one permutation $\pi$ into another permutation
$\sigma$. This distance measure is termed the weighted transposition distance, and is denoted by
$d_\varphi(\pi, \sigma)$ [7]. It can easily be shown that most distance measures used for rank
aggregation represent special cases of the weighted transposition distance:

• Kendall's $\tau$, $K(\pi, \sigma) = d_{\varphi_K}(\pi, \sigma)$, where $\varphi_K(i, j) = 1$ for
$|i - j| = 1$ and $\varphi_K(i, j) = \infty$ otherwise.

• Spearman's footrule, $F(\pi, \sigma)$, defined as $\sum_{i=1}^{n} |\pi^{-1}(i) - \sigma^{-1}(i)|$, may
be written as $F(\pi, \sigma) = d_{\varphi_F}(\pi, \sigma)$, where $\varphi_F(i, j) = |i - j|$.

• Cayley's distance, $T(\pi, \sigma) = d_{\varphi_T}(\pi, \sigma)$, where $\varphi_T(i, j) = 1$ for all
$i \neq j$.

In what follows, we focus on the weighted Kendall distance, where
$\varphi(i, i+1) = \varphi(i+1, i) = w_i$, with $w_i$ being non-negative, and $\varphi(i, j) = \infty$
for $|j - i| \neq 1$.

The weighted Kendall distance between two permutations addresses the top-versus-bottom
problem as follows. To model the significance of the top of the list versus the bottom of
the list, one may choose $w_i = n - i$. This means that the weight of swapping the first and
the second rank (location) is $n - 1$, while the weight of swapping the $(n-1)$-st and the $n$-th rank
(location) is 1. In this case, transposition weights decay arithmetically as we move towards the
end of the list. We may also choose $w_i = c^i$ for $0 < c < 1$. In this case, the weight decay is
geometric. The weighted transposition distance, in its general form, can be used to model similarities
between elements by assigning small weights to transpositions involving similar elements and large
weights to transpositions involving dissimilar elements. Note that in this case, the weights relate
to transposing elements and not ranks, and thus the distance between two permutations $\pi$ and
$\sigma$ is defined as $d_\varphi(\pi^{-1}, \sigma^{-1})$.
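For small $n$, the weighted Kendall distance can be evaluated by a shortest-path search over rankings, where swapping positions $k$ and $k+1$ costs $w_k$. The sketch below uses Dijkstra's algorithm for this purpose; it is an illustrative brute-force evaluation under the assumption that rankings are tuples and `w[k-1]` stores $w_k$, not the procedure used in the paper. The function name `weighted_kendall` is hypothetical.

```python
import heapq


def weighted_kendall(pi, sigma, w):
    """Cheapest way to turn ranking pi into sigma when swapping the candidates
    in positions k and k+1 (1-based) costs w[k-1].
    Dijkstra over all rankings, so only practical for small n."""
    pi, sigma = tuple(pi), tuple(sigma)
    best = {pi: 0}
    heap = [(0, pi)]
    while heap:
        cost, cur = heapq.heappop(heap)
        if cur == sigma:
            return cost
        if cost > best[cur]:
            continue                              # stale queue entry
        for k in range(len(cur) - 1):
            nxt = list(cur)
            nxt[k], nxt[k + 1] = nxt[k + 1], nxt[k]
            nxt = tuple(nxt)
            if cost + w[k] < best.get(nxt, float("inf")):
                best[nxt] = cost + w[k]
                heapq.heappush(heap, (cost + w[k], nxt))
    raise ValueError("sigma is not a rearrangement of pi")


e = (1, 2, 3, 4, 5)
print(weighted_kendall(e, (2, 3, 4, 5, 1), [1, 1, 1, 1]))  # 4 (plain Kendall tau)
print(weighted_kendall(e, (2, 3, 4, 5, 1), [1, 0, 0, 0]))  # 1 (only the top swap costs)
```

With uniform weights the search returns the ordinary Kendall $\tau$ distance, while a weight vector such as `[1, 0, 0, 0]` only charges for changes of the top candidate.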

In the next section, we describe how to perform efficient (approximate) rank aggregation using
the weighted Kendall distance. Our results are inspired by related algorithmic approaches proposed
in [13].

3 Algorithmic Results 

Rank aggregation is a combinatorial optimization problem over the set of permutations, and as
such, it is computationally costly. Aggregation with Kendall's $\tau$ distance is known to be
NP-hard [13]. However, assuming that $\pi^*$ is the solution to (1), the ranking $\sigma_i$ closest to
$\pi^*$ provides a 2-approximation for the rank aggregate. This easily follows from the fact that
Kendall's $\tau$ satisfies the triangle inequality. As a result, one only has to evaluate the pairwise
distances of the votes in $\Sigma$ in order to identify a constant approximation aggregate for the
problem. Although we do not provide a detailed proof of this claim, the same is true of the weighted
Kendall distance.
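The pick-a-vote approximation amounts to evaluating all pairwise distances and returning the vote with the smallest cumulative distance; by the triangle inequality this vote is at least as good as the vote closest to $\pi^*$. The sketch below illustrates this with the plain Kendall $\tau$ distance on the eleven votes of the test example in Section 4. The helper names `kendall_tau` and `best_vote` are assumptions for illustration; any distance satisfying the triangle inequality could be passed as `dist`.

```python
def kendall_tau(pi, sigma):
    """Number of candidate pairs ordered differently by pi and sigma."""
    pos = {c: r for r, c in enumerate(sigma)}
    seq = [pos[c] for c in pi]
    return sum(1 for a in range(len(seq)) for b in range(a + 1, len(seq))
               if seq[a] > seq[b])


def best_vote(votes, dist=kendall_tau):
    """Return the vote with the smallest cumulative distance to all votes,
    together with that cumulative distance."""
    totals = [sum(dist(v, s) for s in votes) for v in votes]
    i = min(range(len(votes)), key=totals.__getitem__)
    return votes[i], totals[i]


# The eleven votes of the Section 4 example, one tuple per voter.
votes = (3 * [(1, 2, 3, 4, 5)] + 2 * [(2, 3, 4, 5, 1)] + 2 * [(3, 2, 4, 5, 1)]
         + 2 * [(4, 2, 5, 3, 1)] + 2 * [(5, 2, 3, 4, 1)])
print(best_vote(votes))  # ((2, 3, 4, 5, 1), 26)
```

In this particular example the best vote happens to coincide with the optimal aggregate, since 26/11 is the optimal average distance reported for uniform weights.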

3.1 Minimum Weight Bipartite Matching Algorithms 

For any distance function that may be written as
$d(\pi, \sigma) = \sum_{k=1}^{n} f(\pi^{-1}(k), \sigma^{-1}(k))$, where $f$
denotes an arbitrary non-negative function, one can find an exact solution to (1) as follows [13].
Consider a complete weighted bipartite graph $\mathcal{G} = (X, Y)$, with $X = \{1, 2, \ldots, n\}$
corresponding to the $n$ ranks to be filled and $Y = \{1, 2, \ldots, n\}$ corresponding to the elements
of $[n]$, i.e., the candidates. Let $(i, j)$ denote an edge between $i \in X$ and $j \in Y$. We say that
a perfect bipartite matching $P$ corresponds to a permutation $\pi$ whenever $(i, j) \in P$ if and only
if $\pi(i) = j$. If the weight of $(i, j)$ equals

$$\sum_{l=1}^{m} f(i, \sigma_l^{-1}(j)), \tag{2}$$

i.e., the weight incurred by $\pi(i) = j$, then the minimum weight perfect matching corresponds to a
solution of (1).

For example, if $\varphi$ is a metric path weight function$^1$, we have

$$d_\varphi(\pi, \sigma) = \sum_{i=1}^{n} f(\pi^{-1}(i), \sigma^{-1}(i)),$$

where $f = \varphi$.

Furthermore, note that $\sum_{k=1}^{n} f(\pi^{-1}(k), \sigma^{-1}(k))$ is a generalization of
Spearman's footrule, and thus for Spearman's footrule, one may find the exact solution as well.
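The matching formulation can be sketched as follows for Spearman's footrule, $f(i, j) = |i - j|$. The code builds the edge weights of (2) and solves the assignment problem with a textbook Hungarian algorithm. This is an illustrative implementation under stated assumptions: candidates are labelled $1, \ldots, n$, rankings are tuples, and the names `min_weight_matching`, `aggregate_by_matching` and `footrule_total` are hypothetical; the paper itself obtains the matching with the tool cited in Section 4. The last lines compare the result with an exhaustive search over all rankings.

```python
from itertools import permutations


def min_weight_matching(cost):
    """Hungarian algorithm (O(n^3)); returns col[i] = column matched to row i."""
    n = len(cost)
    INF = float("inf")
    u, v = [0] * (n + 1), [0] * (n + 1)       # dual potentials
    p, way = [0] * (n + 1), [0] * (n + 1)     # p[j] = row matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [INF] * (n + 1), [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                              # augment along the alternating path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    col = [0] * n
    for j in range(1, n + 1):
        col[p[j] - 1] = j - 1
    return col


def aggregate_by_matching(votes, f):
    """Edge (rank i, candidate j) gets weight sum_l f(i, rank of j in vote l)."""
    n = len(votes[0])
    rank = [{c: r + 1 for r, c in enumerate(v)} for v in votes]
    cost = [[sum(f(i, rk[j]) for rk in rank) for j in range(1, n + 1)]
            for i in range(1, n + 1)]
    return tuple(c + 1 for c in min_weight_matching(cost))  # candidate at rank 1..n


def footrule_total(pi, votes):
    """Cumulative Spearman footrule distance from pi to the votes."""
    r = {c: k for k, c in enumerate(pi)}
    return sum(abs(r[c] - k) for v in votes for k, c in enumerate(v))


votes = (3 * [(1, 2, 3, 4, 5)] + 2 * [(2, 3, 4, 5, 1)] + 2 * [(3, 2, 4, 5, 1)]
         + 2 * [(4, 2, 5, 3, 1)] + 2 * [(5, 2, 3, 4, 1)])
agg = aggregate_by_matching(votes, lambda i, j: abs(i - j))
brute = min(footrule_total(p, votes) for p in permutations(range(1, 6)))
print(agg, footrule_total(agg, votes) == brute)
```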

Let $\mathcal{H}$ denote a complete, undirected graph with vertex set $V = [n]$. To each edge $(i, j)$,
$i, j \in [n]$, assign the weight $\varphi(i, j)$. Furthermore, let $p^*(i, j)$ denote the shortest path
(i.e., minimum weight path) between $i$ and $j$ in $\mathcal{H}$, and let $\mathrm{weight}(p^*)$ stand
for the weight of the shortest path.

Theorem 2. For any non-negative weight function $\varphi$, we have

$$\tfrac{1}{2} D(\pi, \sigma) \le d_\varphi(\pi, \sigma) \le 2 D(\pi, \sigma),$$

where

$$D(\pi, \sigma) = \sum_{i=1}^{n} \mathrm{weight}\big(p^*(\pi^{-1}(i), \sigma^{-1}(i))\big).$$

Due to space limitations, the proof is omitted.

Proposition 3. Let $\pi' = \arg\min_{\pi} \sum_{i=1}^{m} D(\pi, \sigma_i)$ and
$\pi^* = \arg\min_{\pi} \sum_{i=1}^{m} d_\varphi(\pi, \sigma_i)$. The permutation $\pi'$ is a
4-approximation to the optimal rank aggregate $\pi^*$. If $\varphi$ is a metric, or if it corresponds
to the weighted Kendall distance, then $\pi'$ is a 2-approximation of $\pi^*$.

Proof. The first part of the proof follows from the simple observation that, by Theorem 2,

$$\sum_{i=1}^{m} d_\varphi(\pi^*, \sigma_i) \ge \frac{1}{2} \sum_{i=1}^{m} D(\pi^*, \sigma_i)$$

and

$$\sum_{i=1}^{m} d_\varphi(\pi', \sigma_i) \le 2 \sum_{i=1}^{m} D(\pi', \sigma_i).$$

So, by optimality of $\pi'$ with respect to $D$,

$$\sum_{i=1}^{m} d_\varphi(\pi', \sigma_i) \le 4 \sum_{i=1}^{m} d_\varphi(\pi^*, \sigma_i),$$

and thus $\pi'$ provides a 4-approximation. The other claim may be proved similarly, by referring to
the results of [7]. □



$^1$ A metric path weight is a weight function obtained by arranging the elements in $[n]$ on a
straight path and by assigning non-negative weights to the edges of the path. The weight of
transposing elements in positions $a$ and $b$, $\varphi(a, b)$, is the weight of the unique path
between $a$ and $b$.

The permutation $\pi'$ can be obtained using minimum weight bipartite matching by letting

$$f(i, j) = \mathrm{weight}(p^*(i, j)).$$

In particular, for the weighted Kendall distance, we let

$$f(i, j) = \sum_{l=i}^{j-1} \varphi(l, l+1).$$
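For the weighted Kendall distance, the shortest path between two positions simply runs along the line of positions, so $f$ reduces to a difference of prefix sums of the weights. The sketch below computes $f$ and the surrogate distance $D$ of Theorem 2 in this way. The names `path_cost` and `D`, the tuple representation of rankings, and the symmetric extension $f(j, i) = f(i, j)$ are assumptions made for illustration.

```python
from itertools import accumulate


def path_cost(w):
    """f(i, j) = w_i + ... + w_{j-1} for 1-based positions i < j,
    extended symmetrically to i > j; w[k-1] stores w_k >= 0."""
    prefix = [0] + list(accumulate(w))            # prefix[t] = w_1 + ... + w_t
    return lambda i, j: abs(prefix[j - 1] - prefix[i - 1])


def D(pi, sigma, f):
    """Sum over candidates of f(rank in pi, rank in sigma)."""
    rp = {c: r + 1 for r, c in enumerate(pi)}
    rs = {c: r + 1 for r, c in enumerate(sigma)}
    return sum(f(rp[c], rs[c]) for c in pi)


f = path_cost([4, 3, 2, 1])                       # arithmetic decay w_i = n - i, n = 5
print(f(1, 3))                                    # 7 = w_1 + w_2
print(D((1, 2, 3, 4, 5), (2, 1, 3, 4, 5), f))     # 8: two candidates each cross w_1
```

In the second call the two rankings differ by a single swap of the top two positions, whose weighted Kendall cost is $w_1 = 4$; the value $D = 8$ is consistent with the bounds of Theorem 2.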

A simple approach for improving the performance of matching-based algorithms is to couple
them with local descent methods. More specifically, the local descent method works as follows.
Assume that an estimate of the aggregate at step $\ell$ equals $\pi^{(\ell)}$. Let
$\Theta_a = \{(k\ k+1) : k = 1, \ldots, n-1\}$ be the set of all adjacent transpositions. Then

$$\pi^{(\ell+1)} = \pi^{(\ell)} \arg\min_{\tau \in \Theta_a} \sum_{i=1}^{m} d(\pi^{(\ell)} \tau, \sigma_i).$$

The search terminates when the cumulative distance of the aggregate from the set $\Sigma$ cannot
be decreased further. We choose the starting point to be the ranking $\pi'$ obtained by the
minimum weight bipartite matching algorithm. This method will henceforth be referred to as
Bipartite Matching with Local Search (BMLS).
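The local descent step can be sketched as a steepest-descent loop over adjacent transpositions. The code below is an illustration only: it uses the plain Kendall $\tau$ distance as a stand-in for the weighted distance, and it starts from an arbitrary ranking rather than from the matching output $\pi'$. The names `kendall_tau` and `local_search` are hypothetical.

```python
def kendall_tau(pi, sigma):
    """Number of candidate pairs ordered differently by pi and sigma."""
    pos = {c: r for r, c in enumerate(sigma)}
    seq = [pos[c] for c in pi]
    return sum(1 for a in range(len(seq)) for b in range(a + 1, len(seq))
               if seq[a] > seq[b])


def local_search(start, votes, dist=kendall_tau):
    """Repeatedly apply the adjacent transposition that lowers the cumulative
    distance the most; stop when no adjacent transposition lowers it."""
    cur = tuple(start)
    cur_cost = sum(dist(cur, s) for s in votes)
    while True:
        best, best_cost = None, cur_cost
        for k in range(len(cur) - 1):
            nxt = list(cur)
            nxt[k], nxt[k + 1] = nxt[k + 1], nxt[k]
            c = sum(dist(nxt, s) for s in votes)
            if c < best_cost:                     # strict improvement only
                best, best_cost = tuple(nxt), c
        if best is None:
            return cur, cur_cost
        cur, cur_cost = best, best_cost


votes = (3 * [(1, 2, 3, 4, 5)] + 2 * [(2, 3, 4, 5, 1)] + 2 * [(3, 2, 4, 5, 1)]
         + 2 * [(4, 2, 5, 3, 1)] + 2 * [(5, 2, 3, 4, 1)])
print(local_search((1, 2, 3, 4, 5), votes))  # ((2, 3, 4, 5, 1), 26)
```

Because only strict improvements are accepted, the cumulative distance decreases at every step and the loop terminates after finitely many iterations.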

3.2 Vote Aggregation using PageRank 

For a ranking $\pi \in \mathbb{S}_n$ and $a, b \in [n]$, $\pi$ is said to rank $a$ before $b$ if
$\pi^{-1}(a) < \pi^{-1}(b)$. We denote this relationship by $a <_\pi b$. The notation $a \le_\pi b$ has
a similar meaning, and is used in the case that one allows $b = a$.

An algorithm for rank aggregation based on the PageRank and HITS algorithms for ranking web
pages was proposed in [13]. PageRank is one of the most important algorithms developed for search
engines used by Google, with the aim of scoring webpages based on their relevance. Each webpage
that has hyperlinks to other webpages is considered as a voter, while the voter's preferences for
candidates are expressed via the hyperlinks. When a hyperlink to a webpage is not present, it
is assumed that the voter does not support the given candidate's webpage. The ranking of the
webpages is obtained by computing the equilibrium distribution of the associated Markov chain, and
ordering the pages according to the values of their probabilities. The connectivity of the Markov
chain provides transitive information about pairwise candidate preferences, and states with high
input probability correspond to candidates ranked highly in a large number of lists.

This idea can be easily adapted to the rank aggregation scenario in several different settings. In
such an adaptation, it is assumed that the states of the Markov chain correspond to the candidates
to be voted on and that the transition probabilities are functions of the votes. Dwork et al. [13],
[14] proposed four different ways for computing the transition probabilities from the votes. For
completeness, we briefly describe the methods below before we proceed to describe a new approach
for evaluating the transition probabilities for the case of the weighted Kendall distance.

Let $P$ denote the state transition probability matrix of the chain, with $P_{ij}$ denoting the
probability of going from state (candidate) $i$ to state $j$. Furthermore, let

$$\alpha_{ij}(\sigma) = \begin{cases} 1, & \text{if } j \le_\sigma i, \\ 0, & \text{otherwise,} \end{cases}$$

and

$$\alpha_{ij} = \sum_{\sigma \in \Sigma} \alpha_{ij}(\sigma).$$

That is, $\alpha_{ij}$ is the number of voters that ranked candidate $j$ at least as high as candidate $i$.
In the first case (Case 1), the transition probabilities are computed according to

$$P_{ij} = \frac{I(\alpha_{ij} > 0)}{\sum_{k} I(\alpha_{ik} > 0)},$$

where $I(x > 0)$ equals 1 if $x > 0$ and equals 0 otherwise. In the second scenario (Case 2), the
probabilities are set to

$$P_{ij} = \frac{1}{m} \sum_{\sigma \in \Sigma} P_{ij}(\sigma),$$

with

$$P_{ij}(\sigma) = \frac{\alpha_{ij}(\sigma)}{\sum_{k} \alpha_{ik}(\sigma)}.$$

For Case 3, the transition probabilities are evaluated as

$$P_{ij} = \frac{1}{m} \sum_{\sigma \in \Sigma} P_{ij}(\sigma),$$

with $P_{ij}(\sigma) = \frac{1}{n}$ for $j <_\sigma i$ and
$P_{ii}(\sigma) = 1 - \sum_{j \neq i} P_{ij}(\sigma)$. The fourth method (Case 4) differs from
the previous methods in that it uses transition probabilities based on majority votes, and
will not be used in our subsequent studies.

Our Markov chain model for weighted Kendall distance is similar to Case 3, with a major 
modification that includes incorporating transposition weights into the transition probabilities. To 
accomplish this task, we proceed as follows. 

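Let $w_k = \varphi(k, k+1)$, and let $i_\sigma = \sigma^{-1}(i)$ for candidate $i$, $i = 1, \ldots, n$.
Note that $i_\sigma > j_\sigma$ if and only if $i >_\sigma j$. For $l > k$, let

$$w(k : l) = \sum_{h=k}^{l-1} w_h$$

denote the sum of the weights of transpositions $(k\ k+1), (k+1\ k+2), \ldots, (l-1\ l)$. We set

$$\beta_{ij}(\sigma) = \max_{l \,:\, j_\sigma \le l < i_\sigma} \frac{w(l : i_\sigma)}{i_\sigma - l} \tag{3}$$

if $j_\sigma < i_\sigma$, $\beta_{ij}(\sigma) = 0$ if $j_\sigma > i_\sigma$, and

$$\beta_{ii}(\sigma) = \sum_{k \,:\, k_\sigma > i_\sigma} \beta_{ki}(\sigma).$$

The transition probabilities equal

$$P_{ij} = \frac{1}{m} \sum_{\sigma \in \Sigma} P_{ij}(\sigma),$$

with

$$P_{ij}(\sigma) = \frac{\beta_{ij}(\sigma)}{\sum_{k} \beta_{ik}(\sigma)}.$$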
         Aggregate ranking and average distance
Method   w = [1,0,0,0]          w = [1,1,1,1]          w = [1,1,0,0]         w = [0,1,0,0]
OPT      [1,4,3,2,5], 0.7273    [2,3,4,5,1], 2.3636    [2,3,4,5,1], 1.455    [3,2,5,4,1], 0.636
BMLS     [1,2,3,4,5], 0.7273    [2,3,4,5,1], 2.3636    [2,3,1,5,4], 1.455    [2,3,1,5,4], 0.636
MC       [1,2,5,4,3], 0.7273    [2,3,4,5,1], 2.3636    [2,1,3,4,5], 1.546    [2,3,1,4,5], 0.636

Table 1: The aggregate rankings and the average distance of the aggregate ranking from the votes
for different weight functions w.



Intuitively, the transition probabilities described above may be interpreted in the following
manner. The transition probabilities are obtained by averaging the transition probabilities
corresponding to individual votes $\sigma \in \Sigma$. For each vote $\sigma$, let us first consider
the case $j_\sigma = i_\sigma - 1$. In this case, the probability of going from candidate $i$ to
candidate $j$ is proportional to $w_{j_\sigma} = \varphi(j_\sigma, i_\sigma)$. This implies that if
$w_{j_\sigma} > 0$, one moves from candidate $i$ to candidate $j$ with positive probability.
Furthermore, larger values of $w_{j_\sigma}$ result in higher probabilities for moving from $i$ to $j$.

Next, consider a candidate $k$ with $k_\sigma = i_\sigma - 2$. In this case, it seems reasonable to let
the probability of transitioning from candidate $i$ to candidate $k$ be proportional to
$\frac{w_{j_\sigma} + w_{k_\sigma}}{2}$. However, since $k$ is ranked before $j$ by vote $\sigma$, it is
natural to require that the probability of moving to candidate $k$ from candidate $i$ is at least as
high as the probability of moving to candidate $j$ from candidate $i$. This reasoning leads to
$\beta_{ik}(\sigma) = \max\{w_{j_\sigma}, \frac{w_{j_\sigma} + w_{k_\sigma}}{2}\}$ and motivates using
the maximum in (3). Finally, the probability of staying with candidate $i$ is proportional to the sum
of the $\beta$'s from candidates placed below candidate $i$.
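The construction of the chain can be sketched in code as follows. This is an illustrative reading of the description above, under explicit assumptions: $\beta_{ij}(\sigma)$ for a candidate $j$ ranked above $i$ is taken to be the largest average of the weights $w_l, \ldots, w_{i_\sigma - 1}$ over $l = j_\sigma, \ldots, i_\sigma - 1$ (which reproduces the two cases discussed above), each per-vote row is normalized to sum to one, and a row whose weights all vanish is turned into a self-loop. The names `transition_matrix` and `stationary` are hypothetical, candidates are assumed to be labelled $1, \ldots, n$, and the stationary distribution is approximated by power iteration.

```python
def transition_matrix(votes, w):
    """Average over votes of the per-vote chains; w[k-1] stores w_k."""
    n, m = len(votes[0]), len(votes)
    pre = [0.0]
    for x in w:
        pre.append(pre[-1] + x)                   # pre[t] = w_1 + ... + w_t

    def beta_up(p, q):
        # Candidate at position p looks at a candidate at position q < p:
        # largest average weight over the segments l..p-1, l = q, ..., p-1.
        return max((pre[p - 1] - pre[l - 1]) / (p - l) for l in range(q, p))

    P = [[0.0] * n for _ in range(n)]
    for v in votes:
        pos = {c: r + 1 for r, c in enumerate(v)}
        for i in range(1, n + 1):
            row = [0.0] * n
            for j in range(1, n + 1):
                if pos[j] < pos[i]:               # j is ranked above i
                    row[j - 1] = beta_up(pos[i], pos[j])
                elif pos[j] > pos[i]:             # j is below i: feeds the self-loop
                    row[i - 1] += beta_up(pos[j], pos[i])
            s = sum(row)
            if s == 0:                            # assumption: stay put if all weights vanish
                row[i - 1], s = 1.0, 1.0
            for j in range(n):
                P[i - 1][j] += row[j] / (s * m)
    return P


def stationary(P, iters=5000):
    """Approximate the stationary distribution by power iteration."""
    n = len(P)
    x = [1.0 / n] * n
    for _ in range(iters):
        x = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]
    return x


votes = (3 * [(1, 2, 3, 4, 5)] + 2 * [(2, 3, 4, 5, 1)] + 2 * [(3, 2, 4, 5, 1)]
         + 2 * [(4, 2, 5, 3, 1)] + 2 * [(5, 2, 3, 4, 1)])
probs = stationary(transition_matrix(votes, [1, 1, 0, 0]))
ranking = sorted(range(1, 6), key=lambda c: -probs[c - 1])
print([round(p, 3) for p in probs], ranking)
```

Candidates are ordered by decreasing stationary probability. The votes used here are those of the test example in Section 4, so the output for the weight vector `[1, 1, 0, 0]` can be compared with the probability vector reported there.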

4 Results 

The performance of the Markov chain approaches described above cannot be evaluated analytically. 
A common approach when dealing with heuristic methods for hard combinatorial optimization 
problems is to test the performance of the scheme on examples for which the optimal solutions are 
easy to evaluate in closed form. 

In what follows, we evaluate the various aggregation approaches on a simple test example, with
$m = 11$. The set of votes (rankings) $\Sigma$ is given in matrix form by

$$\Sigma = \begin{pmatrix}
1 & 1 & 1 & 2 & 2 & 3 & 3 & 4 & 4 & 5 & 5 \\
2 & 2 & 2 & 3 & 3 & 2 & 2 & 2 & 2 & 2 & 2 \\
3 & 3 & 3 & 4 & 4 & 4 & 4 & 5 & 5 & 3 & 3 \\
4 & 4 & 4 & 5 & 5 & 5 & 5 & 3 & 3 & 4 & 4 \\
5 & 5 & 5 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1
\end{pmatrix}.$$

Here, each column corresponds to a vote, e.g., $\sigma_1 = [1, 2, 3, 4, 5]$. Let us consider candidates
1 and 2. Using a plurality rule, one would arrive at the conclusion that candidate 1 should be the
winner, given that 1 appears most often at the top of the list. Under a number of other aggregation
rules, including Kendall's $\tau$ and Borda's method, candidate 2 would be the winner.
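The disagreement between the plurality rule and Borda's method on this example can be checked with a few lines of code. The sketch below is illustrative: it scores Borda's method by the sum of a candidate's ranks (lower is better), which is one common convention and an assumption here, and the variable names are hypothetical.

```python
from collections import Counter

# The eleven votes of the example, one tuple per column of the matrix.
votes = (3 * [(1, 2, 3, 4, 5)] + 2 * [(2, 3, 4, 5, 1)] + 2 * [(3, 2, 4, 5, 1)]
         + 2 * [(4, 2, 5, 3, 1)] + 2 * [(5, 2, 3, 4, 1)])

plurality = Counter(v[0] for v in votes)          # number of first places
borda = Counter()
for v in votes:
    for r, c in enumerate(v, start=1):
        borda[c] += r                             # sum of ranks; lower is better

plurality_winner = max(plurality, key=plurality.get)
borda_winner = min(borda, key=borda.get)
print(plurality_winner, borda_winner)             # 1 2
```

Candidate 1 collects three first places against two for every other candidate, while candidate 2 has the smallest rank sum (20), since every voter places it first or second.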
Group            Method   Aggregate Ranking                        Average Distance
Graduate (28)    BMLS     10, 12, 9, 8, 1, 3, 2, 11, 7, 4, 6, 5    5.0918
                 MC       10, 12, 9, 8, 1, 11, 3, 2, 7, 5, 4, 6    5.1087
Undergrad (73)   BMLS     12, 9, 8, 1, 3, 10, 4, 2, 11, 7, 5, 6    5.4044
                 MC       12, 9, 8, 1, 3, 10, 4, 7, 2, 11, 5, 6    5.4106

Table 2: Aggregate rankings for undergraduate and graduate students.



Group                    Method   Aggregate Ranking                        Average Distance
Female, Undergrad (32)   BMLS     12, 9, 1, 8, 3, 7, 4, 10, 2, 5, 11, 6    5.3218
                         MC       12, 9, 8, 1, 3, 10, 7, 2, 5, 4, 11, 6    5.3634
Male, Undergrad (31)     BMLS     12, 9, 8, 3, 1, 10, 11, 7, 4, 2, 5, 6    5.3457
                         MC       12, 9, 8, 10, 1, 3, 11, 2, 7, 4, 5, 6    5.421
DNI, Undergrad (10)      BMLS     8, 12, 4, 1, 3, 9, 7, 2, 10, 11, 6, 5    4.2796
                         MC       12, 8, 4, 1, 3, 9, 10, 11, 7, 2, 6, 5    4.4338

Table 3: Aggregate rankings for female and male students.



Our goal is to see how the distance-based rank aggregation algorithms would position these two
candidates. The numerical results regarding this example are presented in Table 1. In the tables,
OPT refers to the optimum solution, which was found by exhaustive search, and MC refers to the
Markov chain method. Furthermore, minimum weight bipartite matching is obtained using [15].

If the weight function is $w^{(1)} = [1, 0, 0, 0]$, the optimal aggregate vote clearly corresponds to
the plurality winner. That is, the winner is the candidate with most voters ranking him/her as the top
candidate. A quick check of Table 1 reveals that all three methods identify the winner correctly.
Note that the ranks of candidates other than candidate 1 obtained by the different methods are
different; however, this does not affect the distance between the aggregate ranking and the votes.

The next weight function that we consider is the uniform weight function, $w^{(u)} = [1, 1, 1, 1]$.
This weight function corresponds to the conventional Kendall's $\tau$ distance. As shown in Table 1,
all three methods produce $[2, 3, 4, 5, 1]$, and the aggregates returned by BMLS and MC are optimum.
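For a problem of this size, the exhaustive search behind the OPT entry for uniform weights is easy to reproduce. The sketch below enumerates all $5! = 120$ rankings and evaluates the cumulative Kendall $\tau$ distance of each to the eleven votes. The names `kendall_tau` and `exhaustive_aggregate` are hypothetical, and the code only covers the uniform weight function.

```python
from itertools import permutations


def kendall_tau(pi, sigma):
    """Number of candidate pairs ordered differently by pi and sigma."""
    pos = {c: r for r, c in enumerate(sigma)}
    seq = [pos[c] for c in pi]
    return sum(1 for a in range(len(seq)) for b in range(a + 1, len(seq))
               if seq[a] > seq[b])


def exhaustive_aggregate(votes, dist=kendall_tau):
    """Return (cumulative distance, ranking) minimizing the cumulative distance."""
    n = len(votes[0])
    return min(((sum(dist(p, s) for s in votes), p)
                for p in permutations(range(1, n + 1))), key=lambda t: t[0])


votes = (3 * [(1, 2, 3, 4, 5)] + 2 * [(2, 3, 4, 5, 1)] + 2 * [(3, 2, 4, 5, 1)]
         + 2 * [(4, 2, 5, 3, 1)] + 2 * [(5, 2, 3, 4, 1)])
total, best = exhaustive_aggregate(votes)
print(best, round(total / len(votes), 4))  # (2, 3, 4, 5, 1) 2.3636
```

The optimum is unique here because the ranking $[2, 3, 4, 5, 1]$ agrees with a strict majority of voters on every pair of candidates.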

The weight function $w^{(2)} = [1, 1, 0, 0]$ corresponds to ranking of the top 2 candidates. OPT
and BMLS return 2, 3 as the top two candidates, while both preferring 2 to 3. The MC method,
however, returns 2, 1 as the top two candidates, with a preference for 2 over 1, and a suboptimal
cumulative distance. It should be noted that the MC method is not designed to only minimize
the average distance. Another important factor in determining the winners via the MC method
is that "winning against strong candidates makes one strong". In this example, candidate 1 beats
the strongest candidate, candidate 2, three times, while candidate 3 beats candidate 2 only twice,
and this fact seems to be the reason for the MC algorithm to prefer candidate 1 to candidate 3.
Nevertheless, the equilibrium probabilities of candidates 1 and 3 obtained by the MC method are
very close to each other, as the vector of probabilities is $[0.137, 0.555, 0.132, 0.0883, 0.0877]$.

The weight function $w = [0, 1, 0, 0]$ corresponds to identifying the top 2 candidates (it is not
important which candidate is the first and which is the second). OPT and BMLS identify $\{2, 3\}$
as the top two candidates. The MC method returns the stationary probabilities $[0, 1, 0, 0, 0]$, which
means that candidate 2 is an absorbing state in the Markov chain. This occurs because candidate
2 is ranked first or second by all voters. The existence of absorbing states is a drawback of Markov
chain methods. One solution is to remove 2 from the votes and reapply MC. The MC method in this
case results in the stationary distribution $[p(1), p(3), p(4), p(5)] = [0.273, 0.364, 0.182, 0.182]$,
which gives us the ranking $[3, 1, 4, 5]$. Together with the fact that candidate 2 is the strongest
candidate, we obtain the ranking $[2, 3, 1, 4, 5]$.

Equipped with this insight, we now perform an aggregation study on a set of rankings collected
from UIUC undergraduate and graduate students, pertaining to criteria for the quality of academic
experience (University Climate Study Data), listed below. The weight function
$w = [w_1, \ldots, w_{n-1}]$ was chosen as $w_i = (3/4)^{i-1}$, $i = 1, \ldots, n-1$.

1. Campus friendliness and inclusiveness 

2. Availability of recreational and cultural facilities 

3. Quality of classrooms and dorms 

4. Extracurricular student groups and activities 

5. Geographical proximity to family/partner 

6. Commitment of campus to build a diverse community 

7. Being able to express one's personal identity freely 

8. Being able to make friends on campus 

9. Safety and security 

10. Availability of financial support/scholarship 

11. Availability of personal counseling/academic tutoring 

12. Friendliness/quality of faculty/instructors

The results of the vote aggregation are presented in Tables 2 and 3. In Table 3, a group
of 10 students who Did Not Indicate their sex is referred to as DNI. An interesting finding is
that the most important criterion for undergraduate students is the effectiveness and friendliness
of instructors.

Acknowledgment: The authors are grateful to Tzu-Yueh Tseng for helping with the numerical
simulations and to Eitan Yaakobi for useful discussions. The work was supported by the NSF grants
CCF 0821910, CCF 0809895, and CCF 0939370.